Abstract. In this work we deal with a mechanism for process simulation called a NonDeterministic
Stochastic Activity Network (NDSAN). An NDSAN consists basically of a set of activities along with
precedence relations involving these activities, which determine their order of execution. Activity
durations are stochastic, given by continuous, nonnegative random variables. The nondeterministic
behavior of an NDSAN is based on two additional possibilities: (i) by associating choice probabilities
with groups of activities, some branches of execution may not be taken; (ii) by allowing iterated
executions of groups of activities according to predetermined probabilities, the number of times an
activity must be executed is not determined a priori. These properties lead to a rich variety of
activity networks, capable of modeling many real situations in process engineering, project design,
and troubleshooting. We describe a recursive simulation algorithm for NDSANs, whose repeated
execution produces a close approximation to the probability distribution of the completion time of
the entire network. We also report on real-world case studies.

Keywords: activity networks, stochastic activity networks, nondeterministic activity networks, 
stochastic project scheduling problems. 

1 Introduction 

In this work we deal with a mechanism for process simulation called a NonDeterministic Stochastic
Activity Network (NDSAN). An NDSAN consists basically of a set of activities along with precedence
relations involving these activities, which determine their order of execution. This order is
captured by a digraph with some special properties: the possibility of defining nondeterministic
branches of execution, by associating choice probabilities with some activities, and loops of execution,
which specify the iterated execution of a group of activities according to predetermined loop
probabilities. These properties allow for a rich variety of activity networks, capable of modeling
many real situations in process engineering, project design, and troubleshooting.

There are two main types of activity networks. A deterministic activity network is represented by
a precedence digraph whose topology remains fixed as the activities are executed. Examples of
deterministic activity networks include CPM and PERT networks, see e.g. [10]. On the other hand,
a nondeterministic activity network allows for the possibility of a dynamic topology. Examples of
such networks are inhomogeneous Markov chains, GANs (Generalized Activity Networks) [1], and
GERT (Graphical Evaluation and Review Technique) networks.

The duration of each network activity is given by a random variable. Thus, a fundamental problem 
is determining the distribution of the completion time of the entire network. For deterministic 
activity networks, this general problem is known as the Stochastic Project Scheduling Problem [3].

Our definition of NDSANs combines stochastic activity durations with nondeterminism. In an 
NDSAN, activities are represented by nodes, and an arc oriented from activity $a_i$ to activity $a_j$
means that the execution of $a_j$ may only start after the execution of $a_i$ has ended. Nondeterminism
is achieved, as indicated above, by means of two possibilities: (i) some branches of execution 
are not necessarily taken, and (ii) the number of times a group of activities is to be executed 
is not determined a priori. These additional possibilities are supported by the introduction of 
two new categories of nodes, namely decision nodes and loop nodes. A decision node associates 
probabilities with its out-neighbors and selects one of them to be executed accordingly; this selection 
is interpreted as one possible deterministic scenario among many. A loop node allows the repeated 
execution of a group of activities, the number of iterations depending on probabilities associated 
with the loop node. Loop nodes are particularly interesting to model refinement processes, such as 
quality control and error testing/correction. We also define junction nodes for adequately combining 
the two new constructions into the network. In Section 2 we define NDSANs formally, in terms 
of recursive construction steps that combine smaller NDSANs into larger ones via certain types of 
structured templates. 

In Section 3, we give an analytical description of the random variable T[D] associated with the 
completion time of NDSAN D. We assume that the duration of each activity aj in D is given by a 
continuous, nonnegative random variable Tj. The random variable T[D] is thus given in terms of 
the Tj's and the probabilities associated with the decision/loop nodes. 
Although T[D] can be described precisely, we lack a closed-form expression for it, and even numerical
methods to find its distribution from such a description may be computationally too hard, especially
when the number of activities is large. In Section 4, we describe a recursive simulation algorithm
whose execution returns a single plausible value ("observation") in the sample space of T[D].
Running the simulation algorithm a suitable number N of times produces a close approximation to
the probability distribution of T[D]. The value of N can be obtained by using the same statistic
as the Kolmogorov-Smirnov test, see e.g. [7] (Section 13.5), as we also discuss in Section 4.

Section 5 presents two computational experiments. For each experiment, the result of the simulations
is shown as a frequency histogram together with a fitting curve that approximates the
expected shape of the density of T[D], an approximate probability distribution of T[D], and an
approximate probability density of T[D] obtained from the approximate distribution. Section 6
discusses ongoing work.

In a recent related work, Leemis et al. [9] develop algorithms to calculate the probability distribution
of the completion time of a stochastic activity network with continuous activity durations. In their 
work, activities are modeled by arcs and the networks are acyclic and deterministic (i.e., allow 
no variation in topology). The authors describe a recursive Monte Carlo simulation algorithm, 
which is network-specific and must therefore be rewritten specifically for each new network. Also, 
they provide two exact algorithms, one for series-parallel networks and another for more general 
networks whose nodes have at most two incoming arcs each. 

We remark that all the discussion on random variables in this work can be adapted to the case of 
discrete random variables. (In [13], pp. 122-123, for example, an activity network with discrete
activity durations is given.) 

2 Formal definition of NDSANs 

In this work, D denotes a digraph with n nodes and m arcs. If (v,w) is an arc of D, then node 
v is an in-neighbor of node w, whereas w is an out-neighbor of v. By disregarding arc orientation, 
we may also simply say that v and w are neighbors. A node having no in-neighbors (resp. out- 
neighbors) is called a source node (resp. sink node). If D is a digraph containing a single source 
(resp. sink) node v, then v is denoted by source(D) (resp. sink(D)).

An NDSAN is a special digraph whose node set is partitioned into four subsets of nodes: a subset
$S_a = \{a_i \mid 1 \le i \le n_a\}$ of activity nodes; a subset $S_b = \{b_i \mid 1 \le i \le n_b\}$ of junction nodes; a subset
$S_d = \{d_i \mid 1 \le i \le n_d\}$ of decision nodes; and a subset $S_\ell = \{\ell_i \mid 1 \le i \le n_\ell\}$ of loop nodes.

An activity node $a_i$ represents a single activity (or task) to be executed in the network. The execution
of $a_i$ starts only after the execution of all of its in-neighbors has ended. When the execution of $a_i$
ends, all of its out-neighbors start executing simultaneously. Each activity node has a duration
(execution time) $T_i$, which is a continuous, nonnegative random variable. We assume that the
execution time of an activity node does not depend on the execution time of any other activity
node. That is, the $T_i$'s are independent random variables. An activity node is represented by a
circle. See Figure 1(a).

A junction node $b_i$ is used for a syntactic purpose. It may have several in-neighbors, but it has
a single out-neighbor v. When the execution of any in-neighbor of $b_i$ ends, the execution of v is
started immediately. In other words, $b_i$ acts simply as a "connecting point" of incoming arcs. A
junction node is represented by a square. See Figure 1(b).

A decision node $d_i$ is used to select one particular branch of the execution flow, as described in
what follows. By construction, all of $d_i$'s neighbors are activity nodes. It has a single in-neighbor
$a_h$ and $\alpha_i \ge 2$ out-neighbors $a_{j_1}, \dots, a_{j_{\alpha_i}}$. The execution of $d_i$ is assumed to be instantaneous, and
consists of selecting exactly one of its out-neighbors, say $a_{j_k}$, as the next node to execute. The
activity node $a_{j_k}$ is selected by $d_i$ with probability $p^i_k$, $k = 1, \dots, \alpha_i$, such that $\sum_{k=1}^{\alpha_i} p^i_k = 1$. A
decision node is represented by a lozenge. See Figure 1(c).

A loop node $\ell_i$ represents the usual iteration mechanism. By construction, $\ell_i$ has a single in-neighbor
(a junction node $b_h$) and two out-neighbors (activity nodes $a_r$ and $a_s$). After the execution of $b_h$, a
Boolean condition $E_i$ associated with $\ell_i$ is instantaneously tested: if $E_i$ is false then $a_r$ is executed
next, otherwise $a_s$ is. An array of real values associated with $\ell_i$ gives the sequence $q^i_1, \dots, q^i_{\beta_i}$ of
probabilities corresponding to consecutive passages through $\ell_i$, in such a way that the probability
that $E_i$ is false at the kth passage through $\ell_i$ is $q^i_k$. That is, the probability of exiting the loop
at this point is $1 - q^i_k$. We assume that $q^i_{\beta_i} = 0$ in order to guarantee the termination of the loop
in at most $\beta_i$ consecutive passages through $\ell_i$. A loop node is represented by a filled lozenge. See
Figure 1(d).



Figure 1: Types of node: (a) activity node; (b) junction node; (c) decision node; (d) loop node. 

We are now ready to give the formal definition of NDSANs in terms of recursive construction steps. 
The base NDSAN is a digraph consisting of a single activity node. In a general step, NDSANs 
containing a single source node and a single sink node are combined to yield a larger NDSAN. 

The recursive construction steps are based on the following Substitution Rule: 

Substitution Rule: Let $D_0$ be a digraph and $\{v_1, v_2, \dots, v_\eta\}$ a subset of its node set. Let
$D_1, D_2, \dots, D_\eta$ be NDSANs, each containing a single source node and a single sink node.
Construct an NDSAN D by replacing $v_i$ by $D_i$, $1 \le i \le \eta$, in such a way that every input
(output) arc of $v_i$ in $D_0$ is an input (output) arc of source($D_i$) (sink($D_i$)) in D. Let
$\mathrm{Sub}(D_0, D_1, \dots, D_\eta) = D$.

Definition 1 An NDSAN is defined as follows: 

1. A digraph D consisting of a single activity node is an NDSAN, called the trivial NDSAN. 

2. Let $D_1, D_2, \dots, D_\eta$ be NDSANs.

2.1 If $D_0$ is an acyclic digraph of node set $\{v_1, \dots, v_\eta\}$ containing a single source node and
a single sink node (Figure 2(a)), then $\mathrm{Sub}(D_0, D_1, \dots, D_\eta)$ is an NDSAN, called an
acyclic NDSAN (Figure 2(b)).

2.2 If $D_0$ is the digraph in Figure 3(a), then $\mathrm{Sub}(D_0, D_1, \dots, D_\eta)$ is an NDSAN, called a
decision NDSAN (Figure 3(b)).

Figure 2: Construction of an acyclic NDSAN. 



2.3 If $D_0$ is the digraph in Figure 4(a), then $\mathrm{Sub}(D_0, D_1, D_2, D_3)$ is an NDSAN, called a
loop NDSAN (Figure 4(b)).

It is easy to see that the network $\mathrm{Sub}(D_0, D_1, \dots, D_\eta)$ resulting from 2.1, 2.2, or 2.3 in the above
definition contains a single source node and a single sink node, both activity nodes.

Scope of the definition of NDSANs. Although other definitions of NDSANs may be possible, 
we believe that Definition 1 not only determines a wide class of activity networks, but also allows
the realization of any structured project, since it provides basic constructions that are generally 
thought to suffice for the specification of how concurrent tasks are to interrelate. In other words: 

- an acyclic NDSAN embodies the notion of multiple concurrent execution threads, which may 
be started as a single thread branches out into several independent ones, and terminated as 
they coalesce into a single thread for further execution. 

- a decision NDSAN allows for nondeterministic switches, or decision points, to be incorporated 
into the course of a thread's execution. 

- a loop NDSAN allows any of the above to be iterated, possibly for a probabilistically selected 
number of times. 

Figure 3: Construction of a decision NDSAN.

Figure 4: Construction of a loop NDSAN.

3 Execution time of an NDSAN



In this section we use the following terminology and notation. (See, for instance, [7].) If X is
a random variable, then $F_X$ denotes the probability distribution function (PDF) of X, and $f_X$ the
probability density function (pdf) of X. Recall that, for any t in the domain of X, $F_X(t) = \Pr(X \le t)$.
If X is a continuous variable, we have

$$F_X(t) = \int_{-\infty}^{t} f_X(x)\,dx \quad\text{and}\quad f_X(t) = F_X'(t). \tag{1}$$

Hereafter, the random variable standing for the execution time of NDSAN D will be denoted by
T[D]. This random variable can be determined as follows.

Case 1: D is a trivial NDSAN

Assuming that D consists of the activity node $a_j$, we have $T[D] = T_j$.

Case 2: D is not a trivial NDSAN

By 2.1, 2.2, and 2.3 in Definition 1, T[D] can be recursively determined in terms of
$T[D_1], T[D_2], \dots, T[D_\eta]$.

Case 2.1: D is an acyclic NDSAN

Consider item 2.1 in Definition 1. Let $\mathcal{P}$ be the collection of all directed paths from source($D_0$)
to sink($D_0$). Let $P \in \mathcal{P}$, and write $P = v_{i_1} v_{i_2} \dots v_{i_{|P|}}$, where $|P|$ denotes the number of nodes of
P. Let $D_{i_1}, D_{i_2}, \dots, D_{i_{|P|}}$ be the NDSANs that substitute for $v_{i_1}, v_{i_2}, \dots, v_{i_{|P|}}$. If $S_P$ is the time
required for the serial execution of $D_{i_1}, D_{i_2}, \dots, D_{i_{|P|}}$, then

$$S_P = \sum_{k=1}^{|P|} T[D_{i_k}]. \tag{2}$$

(Recall that $T[D_{i_k}]$ is the random variable standing for the execution time of $D_{i_k}$, $1 \le k \le |P|$.)

Since the $T[D_{i_k}]$'s are independent random variables, the pdf $f_{S_P}$ of $S_P$ is given by the convolution
of the pdfs $f_{T[D_{i_1}]}, f_{T[D_{i_2}]}, \dots, f_{T[D_{i_{|P|}}]}$, that is,

$$f_{S_P}(t) = \big(f_{T[D_{i_1}]} * f_{T[D_{i_2}]} * \cdots * f_{T[D_{i_{|P|}}]}\big)(t). \tag{3}$$

Define $f_1 = f_{T[D_{i_1}]}$ and $f_k = f_{k-1} * f_{T[D_{i_k}]}$, $2 \le k \le |P|$. Then we have, for any t,

$$f_k(t) = \int_0^{\infty} f_{k-1}(t - x)\, f_{T[D_{i_k}]}(x)\,dx \quad\text{and}\quad f_{S_P}(t) = f_{|P|}(t). \tag{4}$$

Following Equation (1), the PDF of $S_P$ is then given by

$$F_{S_P}(t) = \int_0^{t} f_{S_P}(x)\,dx. \tag{5}$$

Having described the variables $S_P$ for $P \in \mathcal{P}$, the random variable T[D] is given by their maximum:

$$T[D] = \max_{P \in \mathcal{P}} S_P. \tag{6}$$

We remark that the variables $S_P$ are not independent, because two distinct paths in $\mathcal{P}$ may have
nodes in common. Hence the PDF of T[D] is given by

$$F_{T[D]}(t) = \Pr(T[D] \le t) = \Pr(S_P \le t \text{ for all } P \in \mathcal{P}), \tag{7}$$

but no further simplification is in general possible. To determine the pdf of T[D], simply apply
Equation (1):

$$f_{T[D]}(t) = (F_{T[D]})'(t). \tag{8}$$
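
As a numerical illustration of Equations (4) and (5), the following sketch discretizes two densities on a uniform grid and approximates their convolution and the resulting distribution function by Riemann sums. The grid step `h`, the helper names `convolve` and `cumulative`, and the choice of two Uniform(0, 1) densities are assumptions made for this example only; they are not taken from the experiments reported in this work.

```python
def convolve(f, g, h):
    """Approximate (f * g) on a grid of step h; f[i] and g[i] are values at t = i*h."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, f_i in enumerate(f):
        for j, g_j in enumerate(g):
            out[i + j] += f_i * g_j * h  # Riemann-sum term of the convolution integral
    return out


def cumulative(f, h):
    """Approximate F(t) = integral of f from 0 to t by a running Riemann sum."""
    total, out = 0.0, []
    for value in f:
        total += value * h
        out.append(total)
    return out


h = 0.01
uniform = [1.0] * 100                  # Uniform(0, 1) density sampled at 0, h, ..., 0.99
f_sum = convolve(uniform, uniform, h)  # density of the sum of two such variables
F_sum = cumulative(f_sum, h)           # its distribution function, as in Equation (5)
```

The resulting `f_sum` has the expected triangular shape peaking near t = 1, and `F_sum` approaches 1 at the right end of the grid. The quadratic cost of this direct convolution hints at why the exact approach becomes expensive for long paths.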

Case 2.2: D is a decision NDSAN 

In Figure 3(b), assume that the decision node is $d_i$. Then $\alpha_i = \eta - 2$ and each node source($D_k$) is
selected by $d_i$ with probability $p^i_k$, $k = 2, 3, \dots, \eta - 1$. Let $X_i$ be a random variable associated with
$d_i$ in such a way that

$$X_i = \begin{cases} T[D_2] & \text{with probability } p^i_2; \\ T[D_3] & \text{with probability } p^i_3; \\ \quad\vdots \\ T[D_{\eta-1}] & \text{with probability } p^i_{\eta-1}. \end{cases} \tag{9}$$

Then, clearly,

$$T[D] = T[D_1] + X_i + T[D_\eta]. \tag{10}$$

In order to proceed, note that the events $X_i = T[D_k]$, $2 \le k \le \eta - 1$, are mutually disjoint, since
they correspond to disjoint subdigraphs of D. We then have

$$f_{X_i}(t) = p^i_2\, f_{T[D_2]}(t) + p^i_3\, f_{T[D_3]}(t) + \cdots + p^i_{\eta-1}\, f_{T[D_{\eta-1}]}(t) \tag{11}$$

and

$$F_{X_i}(t) = p^i_2\, F_{T[D_2]}(t) + p^i_3\, F_{T[D_3]}(t) + \cdots + p^i_{\eta-1}\, F_{T[D_{\eta-1}]}(t). \tag{12}$$

Thus,

$$f_{T[D]}(t) = \big(f_{T[D_1]} * f_{X_i} * f_{T[D_\eta]}\big)(t) \tag{13}$$

and, by Equation (1),

$$F_{T[D]}(t) = \int_0^{t} f_{T[D]}(x)\,dx. \tag{14}$$

Case 2.3: D is a loop NDSAN 

In Figure 4(b), assume that the loop node is $\ell_i$. For simplicity, assume also that $\beta_i = \beta$. Recall
that, at the kth passage through $\ell_i$, the execution flow returns to source($D_2$) with probability
$q^i_k$, $k = 1, \dots, \beta$, where $q^i_\beta = 0$ and $\beta$ is the maximum number of consecutive passages allowed
through $\ell_i$.

Let $Z_k$ be the random variable standing for the total execution time of k serial independent executions
of $D_2$. Clearly, $Z_k$ is the sum of k independent random variables, each one having distribution
identical to that of $T[D_2]$. Therefore, $f_{Z_k}$ and $F_{Z_k}$ can once again be determined respectively by
convolution and subsequent integration.

Consider now a random variable $Y_i$ associated with $\ell_i$ and such that

$$Y_i = \begin{cases} 0 & \text{with probability } 1 - q^i_1; \\ Z_1 & \text{with probability } q^i_1 (1 - q^i_2); \\ Z_2 & \text{with probability } q^i_1 q^i_2 (1 - q^i_3); \\ \quad\vdots \\ Z_{\beta-1} & \text{with probability } q^i_1 q^i_2 \cdots q^i_{\beta-1}, \end{cases} \tag{15}$$

where the events $Y_i = 0$, $Y_i = Z_1$, ..., $Y_i = Z_{\beta-1}$ are all mutually disjoint. Then

$$T[D] = T[D_1] + Y_i + T[D_3], \tag{16}$$

and the functions $f_{T[D]}$ and $F_{T[D]}$ can be obtained as in Case 2.2, since the definition of $Y_i$ in
Equation (15) has the same structure as that of $X_i$ in Equation (9).
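
To make the probabilities attached to the loop variable in Equation (15) concrete, the sketch below computes them from a sequence of return probabilities. The function name `iteration_distribution` is hypothetical, and the input values are the "no" percentages listed for the first loop node in Table 4 (50%, 20%, 0%), used here purely as an illustration.

```python
def iteration_distribution(q):
    """Return [Pr(Y = 0), Pr(Y = Z_1), ..., Pr(Y = Z_{beta-1})].

    q[k-1] is the probability of returning to the loop body at the kth
    passage through the loop node; the last entry must be 0.
    """
    if not q or q[-1] != 0:
        raise ValueError("the last return probability must be 0")
    probs = []
    returned_so_far = 1.0  # probability of having returned at every earlier passage
    for q_k in q:
        probs.append(returned_so_far * (1.0 - q_k))  # exit at this passage
        returned_so_far *= q_k
    return probs


dist = iteration_distribution([0.5, 0.2, 0.0])
# dist is approximately [0.5, 0.4, 0.1]: zero, one, or two executions of the loop body
```

Because the last return probability is 0, the computed probabilities always sum to 1, which is the termination guarantee stated in the definition of loop nodes.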

4 Obtaining an approximate distribution of the execution time

Given an NDSAN D, obtaining the distribution and density functions of the target random variable
T[D] numerically may be an extremely costly computational task, even in simple cases. We refer
the reader once again to the work by Leemis et al. [9], where even small networks are seen to need
an elaborate mathematical analysis.

Our efforts are then directed toward seeking an approximate distribution of T[D] within some
required confidence level. We base our approach on collecting a random sample formed by a suitable
number N of independent observations of T[D]. Let us denote such an approximate distribution
by $F^N_{T[D]}$. Once $F^N_{T[D]}$ is obtained, a frequency histogram and an approximate density $f^N_{T[D]}$ can be
easily determined, as we discuss later.

First, we present a simulation algorithm that, on input D, outputs a single observation t of the
sample space of T[D]. Next, we deal with the question of how many times the simulation algorithm
must be repeated in order to obtain $F^N_{T[D]}$ as required.

4.1 Simulation algorithm 

The simulation algorithm is based on recursive references to subdigraphs, whose results are combined
to obtain a single observation t of T[D]. The basis of the recursion occurs when D is a trivial
NDSAN.

For acyclic NDSANs (refer to item 2.1 in Definition 1 and to Figure 2(b)), a single observation of
T[D] is obtained as follows: (i) observations $t_1, t_2, \dots, t_\eta$ of $T[D_1], T[D_2], \dots, T[D_\eta]$ are obtained
recursively; (ii) denote by $C_D(t_1, t_2, \dots, t_\eta)$ the completion time of D when $T[D_i] = t_i$, $1 \le i \le \eta$;
the determination of $C_D(t_1, t_2, \dots, t_\eta)$ can be done by assigning weight $t_i$ to vertex $v_i$, $1 \le i \le \eta$,
and then calculating the critical path of the resulting weighted digraph.
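
The critical-path computation described above can be sketched as a longest-path calculation on a node-weighted acyclic digraph. The function name `completion_time`, the dictionary-based graph representation, and the four-node example are assumptions chosen for illustration; they are not part of the original algorithm description.

```python
def completion_time(preds, weight):
    """Length of the longest node-weighted path in an acyclic digraph.

    preds maps each node to the set of its in-neighbors; weight maps each
    node to the observed duration assigned to it.
    """
    finish = {}  # memoized earliest finish time of each node

    def finish_time(v):
        if v not in finish:
            # A node starts when the last of its in-neighbors has finished.
            start = max((finish_time(u) for u in preds.get(v, ())), default=0.0)
            finish[v] = start + weight[v]
        return finish[v]

    return max(finish_time(v) for v in weight)


# Diamond-shaped example: v1 precedes v2 and v3, which both precede v4.
preds = {"v1": set(), "v2": {"v1"}, "v3": {"v1"}, "v4": {"v2", "v3"}}
weight = {"v1": 2.0, "v2": 3.0, "v3": 5.0, "v4": 1.0}
total = completion_time(preds, weight)  # critical path v1, v3, v4
```

Each node and arc is visited once thanks to the memoization, so the cost is linear in the size of the digraph.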

The description of the simulation algorithm is as follows. 

Sample(D)

 1  if D is a trivial NDSAN then
 2      let a_j be the single activity node of D
 3      return a single observation of T_j
 4  else if D is an acyclic NDSAN then
 5      let D_1, D_2, ..., D_η be NDSANs as in Figure 2(b)
 6      return C_D(Sample(D_1), Sample(D_2), ..., Sample(D_η))
 7  else if D is a decision NDSAN then
 8      let D_1, D_2, ..., D_η be NDSANs as in Figure 3(b)
 9      let d_i be the decision node of D
10      select k from {2, 3, ..., η − 1}
11      return Sample(D_1) + Sample(D_k) + Sample(D_η)
12  else if D is a loop NDSAN then
13      let D_1, D_2, D_3 be NDSANs as in Figure 4(b)
14      let ℓ_i be the loop node of D
15      select k from {0, 1, ..., β_i − 1}
16      t_loop := 0
17      repeat k times
18          t_loop := t_loop + Sample(D_2)
19      return Sample(D_1) + t_loop + Sample(D_3)



We assume that obtaining the single observation in Line 3 can be done in constant time. We also
assume that the selections in Lines 10 and 15 take constant time. Note that they are related to
observations of the random variables $X_i$ and $Y_i$, respectively (see Equations (9) and (15)), and
must therefore be made according to the probabilities expressed there. Calculating $C_D$ in Line 6 takes
O(m) time. (The critical path can be determined by a depth-first search starting at source(D).)
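
A minimal Python sketch of the recursive procedure is given below. The class names (`Trivial`, `Acyclic`, `Decision`, `Loop`), their fields, and the use of a `random.Random` generator are all assumptions made for illustration. One deliberate difference from the pseudocode: instead of selecting the number k of loop iterations up front, the loop case tests the return probability at each passage, which yields the same distribution for k.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Trivial:
    draw: Callable[[random.Random], float]  # returns one observation of the duration


@dataclass
class Acyclic:
    parts: Dict[str, object]    # node of the template digraph -> substituted NDSAN
    preds: Dict[str, Set[str]]  # node of the template digraph -> its in-neighbors


@dataclass
class Decision:
    first: object
    branches: List[object]      # alternative sub-NDSANs
    probs: List[float]          # selection probabilities, summing to 1
    last: object


@dataclass
class Loop:
    first: object
    body: object
    q: List[float]              # return probabilities, with q[-1] == 0
    last: object


def sample(d, rng):
    """Return one observation of the completion time of the NDSAN d."""
    if isinstance(d, Trivial):
        return d.draw(rng)
    if isinstance(d, Acyclic):
        t = {v: sample(sub, rng) for v, sub in d.parts.items()}
        finish = {}

        def finish_time(v):  # critical path on the node-weighted template
            if v not in finish:
                start = max((finish_time(u) for u in d.preds.get(v, ())), default=0.0)
                finish[v] = start + t[v]
            return finish[v]

        return max(finish_time(v) for v in t)
    if isinstance(d, Decision):
        chosen = rng.choices(d.branches, weights=d.probs, k=1)[0]
        return sample(d.first, rng) + sample(chosen, rng) + sample(d.last, rng)
    if isinstance(d, Loop):
        t_loop = 0.0
        for q_k in d.q:
            if rng.random() >= q_k:  # exit the loop with probability 1 - q_k
                break
            t_loop += sample(d.body, rng)
        return sample(d.first, rng) + t_loop + sample(d.last, rng)
    raise TypeError("unknown NDSAN type")


# Deterministic unit durations, used only to make the behavior easy to check.
unit = Trivial(lambda rng: 1.0)
net = Loop(first=unit, body=unit, q=[0.5, 0.2, 0.0], last=unit)
rng = random.Random(0)
values = [sample(net, rng) for _ in range(1000)]  # each value is 2.0, 3.0, or 4.0
```

In a realistic setting each `Trivial` node would draw from its own duration distribution rather than return a constant.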

Overall, the time complexity of the algorithm is determined by the maximum number of nested
loop NDSANs in D. Suppose that $D_1, D_2, \dots, D_\gamma$ is the longest sequence of subdigraphs of D such
that:

- $D_k$ is a loop NDSAN, $1 \le k \le \gamma$;

- $D_{k+1}$ is a proper subdigraph of $D_k$, $1 \le k \le \gamma - 1$.

Let $\beta = \max\{\beta_i \mid 1 \le i \le n_\ell\}$. Then in each $D_k$ at most $\beta - 1$ consecutive iterations are performed.
Hence, the worst-case time complexity of the algorithm is $O(\beta^\gamma m)$. Although $\gamma = O(n)$ and $\beta$
can be arbitrarily large, for most typical NDSANs the values of $\gamma$ and $\beta$ are bounded by small
constants. Thus the algorithm has, in practice, an O(m) time complexity.



4.2 Repeated executions of the simulation algorithm

Since T[D] is a continuous variable, we may resort to the same statistic on $F^N_{T[D]}$ as the Kolmogorov-Smirnov
(KS) test. We refer the reader to [7] (Section 13.5) and to [8] (Section 3.3.1) for more
details on what follows.

Let $t_1, t_2, \dots, t_N$ be a random sample of T[D], obtained by N independent executions of the simulation
algorithm. Define $F^N_{T[D]}$ as

$$F^N_{T[D]}(x) = \frac{\text{number of } t_i \text{ such that } t_i \le x}{N}. \tag{17}$$

The KS test is based on the difference between $F^N_{T[D]}(x)$ and $F_{T[D]}(x)$. To measure this difference,
we form the statistic

$$K_N = \sup_{x \ge 0} \big| F^N_{T[D]}(x) - F_{T[D]}(x) \big| \tag{18}$$

(hereafter referred to as the KS statistic), which may be visualized as the maximum distance (error),
along the ordinate axis, between the plots of $F^N_{T[D]}(x)$ and $F_{T[D]}(x)$ over the range of all possible x
values. It can be shown (see [7], p. 346) that the distribution of $K_N$ does not depend on $F_{T[D]}$. As
a consequence, $K_N$ can be used as a nonparametric random variable for constructing a confidence
band for $F_{T[D]}$.

Let $K^\varepsilon_N$ denote a value satisfying the relation

$$\Pr(K_N \le K^\varepsilon_N) = 1 - \varepsilon \tag{19}$$

for some $0 < \varepsilon < 1$. Following Equations (18) and (19), we have:

$$\begin{aligned} 1 - \varepsilon &= \Pr\Big( \sup_{x \ge 0} \big| F^N_{T[D]}(x) - F_{T[D]}(x) \big| \le K^\varepsilon_N \Big) \\ &= \Pr\big( \big| F^N_{T[D]}(x) - F_{T[D]}(x) \big| \le K^\varepsilon_N \text{ for all } x \ge 0 \big) \\ &= \Pr\big( F^N_{T[D]}(x) - K^\varepsilon_N \le F_{T[D]}(x) \le F^N_{T[D]}(x) + K^\varepsilon_N \text{ for all } x \ge 0 \big). \end{aligned} \tag{20}$$

The last equality in Equation (20) shows that the functions $F^N_{T[D]}(x) - K^\varepsilon_N$ and $F^N_{T[D]}(x) + K^\varepsilon_N$ yield
a confidence band, with confidence level $1 - \varepsilon$, for the unknown distribution function $F_{T[D]}(x)$.
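
The empirical distribution function and the confidence band around it can be sketched as follows. The function names `empirical_cdf` and `confidence_band` are hypothetical, and clipping the band to the interval [0, 1] is an addition of this sketch (justified only by the fact that any distribution function takes values in that interval).

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return the function x -> fraction of observations that are <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def confidence_band(sample, k_eps):
    """Return (lower, upper) functions at distance k_eps from the empirical CDF."""
    f_n = empirical_cdf(sample)
    lower = lambda x: max(0.0, f_n(x) - k_eps)  # clipped at 0 (assumption of this sketch)
    upper = lambda x: min(1.0, f_n(x) + k_eps)  # clipped at 1 (assumption of this sketch)
    return lower, upper


observations = [3.0, 1.0, 2.0, 2.0]
F = empirical_cdf(observations)          # F(2.0) is 0.75
lower, upper = confidence_band(observations, 0.1)
```

Sorting once and using binary search keeps each evaluation of the empirical distribution logarithmic in the sample size.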

Table 1: Some critical values $K^\varepsilon_N$ for $K_N$.

N       $\varepsilon = 0.20$   $\varepsilon = 0.10$   $\varepsilon = 0.05$   $\varepsilon = 0.01$
10      0.32                   0.37                   0.41                   0.49
20      0.23                   0.26                   0.29                   0.36
30      0.19                   0.22                   0.24                   0.29
40      0.17                   0.19                   0.21                   0.25
50      0.15                   0.17                   0.19                   0.23
large   $1.07/\sqrt{N}$        $1.22/\sqrt{N}$        $1.36/\sqrt{N}$        $1.63/\sqrt{N}$


Some of the values $K^\varepsilon_N$ of the distribution of $K_N$ are given in Table 1 (see [7], p. 411). From Table 1
we have, for example, $K^{0.20}_{50} = 0.15$. Thus

$$\Pr(K_{50} \le K^{0.20}_{50}) = \Pr(K_{50} \le 0.15) = 1 - 0.20 = 0.80. \tag{21}$$

That is, by repeating the simulation algorithm N = 50 times, the probability that the error $K_N$ is
at most 0.15 is 0.80. More accurate results can be obtained by using the last row of Table 1. For
example, by requiring a maximum error 0.02 with confidence 95%, we have $\varepsilon = 0.05$ and

$$\Pr(K_N \le K^{0.05}_N) = \Pr(K_N \le 0.02) = 0.95. \tag{22}$$

For large N, Table 1 gives us $K^{0.05}_N = 1.36/\sqrt{N}$. From $1.36/\sqrt{N} = 0.02$ we conclude that N = 4624
repeated executions of the simulation algorithm are needed in this case.
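
The calculation of N from the last row of Table 1 can be sketched as below. The function name `required_runs` and the dictionary `LARGE_N_COEFF` are hypothetical; the coefficients are the large-N entries of Table 1, and rounding up to the next integer is an assumption of this sketch for cases where the result is not a whole number.

```python
import math

# Large-N coefficients from the last row of Table 1, indexed by epsilon.
LARGE_N_COEFF = {0.20: 1.07, 0.10: 1.22, 0.05: 1.36, 0.01: 1.63}


def required_runs(max_error, epsilon):
    """Smallest N such that coeff / sqrt(N) <= max_error (large-N approximation)."""
    coeff = LARGE_N_COEFF[epsilon]
    return math.ceil((coeff / max_error) ** 2)


n_runs = required_runs(0.02, 0.05)  # 4624, matching the worked example
```

Since N grows with the inverse square of the error, halving the maximum error roughly quadruples the number of simulation runs.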

We can summarize the application of the KS statistic as follows.

1. Stipulate the maximum error e and the confidence level c.

2. Set $\varepsilon = 1 - c$ and determine from Table 1 the value of N for which $K^\varepsilon_N \approx e$.

3. Run the simulation algorithm N times and obtain a random sample $t_1, t_2, \dots, t_N$.

4. Let $F^N_{T[D]}$ be as in Equation (17).

5. If needed, an approximate density $f^N_{T[D]}$ can be determined as follows, assuming $t_1 \le t_2 \le \cdots \le t_N$.
For some step value $\delta > 0$, let

$$f^N_{T[D]}(t_{1+k\delta}) = \frac{F^N_{T[D]}(t_{1+k\delta}) - F^N_{T[D]}(t_{1+(k-1)\delta})}{t_{1+k\delta} - t_{1+(k-1)\delta}}, \quad k = 1, 2, \dots, \lfloor N/\delta \rfloor - 1. \tag{23}$$

For instance, for $\delta = 25$ we compute the values

$$f^N_{T[D]}(t_{26}) = \frac{F^N_{T[D]}(t_{26}) - F^N_{T[D]}(t_1)}{t_{26} - t_1}, \quad f^N_{T[D]}(t_{51}) = \frac{F^N_{T[D]}(t_{51}) - F^N_{T[D]}(t_{26})}{t_{51} - t_{26}},$$

and so on. (We remark that better, nonparametric methods are available, as explained in
[14], for example.)
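
The finite-difference density estimate of step 5 can be sketched as follows. The function name `approximate_density` is hypothetical. Two assumptions are made loudly here: the observations are taken to be distinct, so that the empirical distribution increases by exactly 1/N at each sorted observation, and the index k is taken to range from 1 to floor(N/step) - 1.

```python
def approximate_density(sample, step):
    """Return [(t, density estimate at t)] for every step-th sorted observation.

    Assumes distinct observations, so the empirical distribution grows by
    step / n between consecutive selected points.
    """
    t = sorted(sample)
    n = len(t)
    points = []
    for k in range(1, n // step):            # assumed range of k
        cur, prev = k * step, (k - 1) * step  # 0-based positions of the two observations
        width = t[cur] - t[prev]
        if width > 0:                         # guard against ties
            points.append((t[cur], (step / n) / width))
    return points


points = approximate_density([float(i) for i in range(1, 11)], 3)
# two points, at t = 4.0 and t = 7.0, each with an estimate close to 0.1
```

A larger step smooths the estimate at the cost of resolution, which is the usual trade-off for difference-based density estimates.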



5 Computational experiments 
5.1 A typical development process 

Figure 5 shows a simple, yet typical, development process represented by an NDSAN D with $S_a = \{a_1, \dots, a_{27}\}$,
$S_b = \{b_1, \dots, b_8\}$, $S_d = \{d_1\}$, and $S_\ell = \{\ell_1, \dots, \ell_7\}$.

Table 2 describes the activity nodes, whose durations are expressed in days. Here, all $T_i$'s follow
triangular densities, which are suitable for describing single activities of a business or industrial
process [5]. The pdf $f_X$ of a triangular variable X with parameters $x_1 < x_2 < x_3$ is given by:

$$f_X(x) = \begin{cases} 0, & x < x_1; \\ \dfrac{y_0}{x_2 - x_1}\,(x - x_1), & x_1 \le x \le x_2; \\ \dfrac{y_0}{x_3 - x_2}\,(x_3 - x), & x_2 < x \le x_3; \\ 0, & x > x_3, \end{cases} \tag{24}$$

where $y_0 = \dfrac{2}{x_3 - x_1}$. Table 3 shows the probabilities associated with the decision node $d_1$, Table 4
those associated with the loop nodes $\ell_1$ through $\ell_7$.
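
For reference, the triangular density and a way of drawing observations from it can be sketched with the Python standard library. The function names `triangular_pdf` and `draw_duration` are hypothetical; the peak height 2/(x3 - x1) is the standard normalization of a triangular density, and the parameters (2, 4, 5) are those listed for the first activity in Table 2.

```python
import random


def triangular_pdf(x, x1, x2, x3):
    """Triangular density with lower limit x1, mode x2, upper limit x3 (x1 < x2 < x3)."""
    y0 = 2.0 / (x3 - x1)  # height of the density at the mode
    if x1 <= x <= x2:
        return y0 * (x - x1) / (x2 - x1)
    if x2 < x <= x3:
        return y0 * (x3 - x) / (x3 - x2)
    return 0.0


def draw_duration(rng, x1, x2, x3):
    """Draw one observation; note that random.triangular takes (low, high, mode)."""
    return rng.triangular(x1, x3, x2)


rng = random.Random(0)
durations = [draw_duration(rng, 2.0, 4.0, 5.0) for _ in range(1000)]
peak = triangular_pdf(4.0, 2.0, 4.0, 5.0)  # equals 2 / (5 - 2)
```

The argument order of `random.triangular` differs from the (x1, x2, x3) order used in the text, which is why the wrapper swaps the last two parameters.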

If we require a maximum error of 2% with confidence 95%, the KS statistic yields $K^{0.05}_N = 1.36/\sqrt{N}$
(see Table 1). From $1.36/\sqrt{N} = 0.02$, we conclude that N = 4624 repeated executions of Sample(D)
are required. Each of these executions can be represented by a tree of recursive calls, as follows.
Let $D_i$ be the trivial NDSAN consisting of the activity node $a_i$, $1 \le i \le 27$, and, for i < j,
let $D_{i,j}$ be the NDSAN defined as the maximal connected induced subdigraph D' of D satisfying
source(D') = $a_i$ and sink(D') = $a_j$. Figure 6 depicts the tree of recursive calls. For example, $D_{5,27}$
is a decision NDSAN, and in order to obtain a single observation of $T[D_{5,27}]$ we first recursively
obtain observations of $T[D_5]$, $T[D_6]$, $T[D_{7,26}]$, and $T[D_{27}]$.

The frequency histogram of the resulting sample of T[D] is shown in Figure 7 for 1-wide bins. Each
bin is an interval of the form (a, b] and abscissae in the figure give the values of b. The histogram
suggests that $f_{T[D]}$ follows a bimodal pattern. Figure 7 also shows the fitting curve

$$f_1(x) = 2115\,\mathrm{lognorm}(2.379610, 0.125138, x) + 2509\,\mathrm{lognorm}(3.853650, 0.072067, x), \tag{25}$$

Table 2: Activity nodes of the NDSAN of Figure 5.

Node       Description                  Density parameters
$a_1$      requirement analysis         2, 4, 5
$a_2$      contract negotiation         1, 2.5, 3.5
$a_3$      renegotiation                1, 1.5, 2
$a_4$      contract conclusion          0.5, 1, 1.5
$a_5$      contract presentation        0.5, 1, 1.5
$a_6$      project abandonment          0.5, 1, 1.5
$a_7$      system analysis              4, 8, 12
$a_8$      system analysis refinement   0.5, 2, 3
$a_9$      system analysis conclusion   0.5, 1, 1.5
$a_{10}$   division into modules        0.5, 1, 1.5
$a_{11}$   1st module implementation    4, 6, 12
$a_{12}$   1st module refinement        1, 2, 3
$a_{13}$   1st module conclusion        0.5, 1, 1.5
$a_{14}$   2nd module implementation    4, 6, 12
$a_{15}$   2nd module refinement        1, 2, 3
$a_{16}$   2nd module conclusion        0.5, 1, 1.5
$a_{17}$   3rd module implementation    4, 6, 12
$a_{18}$   3rd module refinement        1, 2, 3
$a_{19}$   3rd module conclusion        0.5, 1, 1.5
$a_{20}$   module integration           0.5, 1.5, 3
$a_{21}$   integration test             1, 3.5, 4
$a_{22}$   error fixing                 0.5, 1, 1.5
$a_{23}$   product deployment           0.5, 1, 1.5
$a_{24}$   client test                  2, 4, 6
$a_{25}$   error fixing                 0.5, 1, 1.5
$a_{26}$   production dispatch          0.5, 1, 1.5
$a_{27}$   project documentation        0.5, 1, 1.5

Table 3: Probabilities associated with the decision node $d_1$ in Figure 5.

Node     Description          Outcome   Next activity   Probability
$d_1$    contract accepted?   yes                       55%
                              no        $a_6$           45%

Table 4: Probabilities associated with the loop nodes in Figure 5.

Node       Description             Outcome   Next activity   1st iter.   2nd iter.   3rd iter.
$\ell_1$   negotiation finished?   yes                       50%         80%         100%
                                   no        $a_3$           50%         20%         0%
$\ell_2$   use cases approved?     yes       $a_9$           10%         50%         100%
                                   no        $a_8$           90%         50%         0%
$\ell_3$   1st module passed?      yes       $a_{13}$        20%         50%         100%
                                   no        $a_{12}$        80%         50%         0%
$\ell_4$   2nd module passed?      yes       $a_{16}$        20%         50%         100%
                                   no        $a_{15}$        80%         50%         0%
$\ell_5$   3rd module passed?      yes       $a_{19}$        20%         50%         100%
                                   no        $a_{18}$        80%         50%         0%
$\ell_6$   integration passed?     yes       $a_{23}$        60%         80%         100%
                                   no        $a_{22}$        40%         20%         0%
$\ell_7$   client test passed?     yes       $a_{26}$        20%         50%         100%
                                   no        $a_{25}$        80%         50%         0%



Figure 6: Recursive calls invoked by Sample(D); D is the NDSAN of Figure 5.

Figure 7: Fitting curve drawn on the frequency histogram of T[D] for N = 4624 and 1-wide bins;
D is the NDSAN of Figure 5.

where $\mathrm{lognorm}(\mu, \sigma, x)$ is the density function of the lognormal distribution [15] with parameters
$\mu$ (the scale parameter) and $\sigma$ (the shape parameter):

$$\mathrm{lognorm}(\mu, \sigma, x) = \frac{1}{\sigma \sqrt{2\pi}\, x} \exp\!\left( -\frac{(\ln x - \mu)^2}{2\sigma^2} \right). \tag{26}$$

The function $f_1(x)$ is therefore proportional to the sum of two densities, the former yielding positive
values over the range (7, 16], the latter over (37, 60].
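
The fitting curve can be evaluated numerically as sketched below, assuming the standard form of the lognormal density with parameters mu and sigma. The function names `lognorm` and `f1` mirror the notation of the text; the numerical constants are those of Equation (25).

```python
import math


def lognorm(mu, sigma, x):
    """Standard lognormal density (assumed form), zero for non-positive x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi) * x)


def f1(x):
    """Fitting curve of Equation (25): a weighted sum of two lognormal densities."""
    return (2115 * lognorm(2.379610, 0.125138, x)
            + 2509 * lognorm(3.853650, 0.072067, x))


# Riemann sum of f1 over (0, 100]; the two weights add up to 4624.
area = sum(f1(0.01 * i) for i in range(1, 10001)) * 0.01
```

Because each lognormal density integrates to 1, the area under the curve is close to 2115 + 2509 = 4624, the number of observations in the histogram.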

The approximate distribution $F^N_{T[D]}$ and density $f^N_{T[D]}$ are shown in Figures 8 and 9, respectively,
the latter with $\delta = 25$ in Equation (23).



5.2 A paper reviewing process 



Figure 10 shows an NDSAN D representing the typical peer-review process of scientific publishing.
Table 5 describes the activity nodes, whose durations are once again expressed in days. The $T_i$'s
follow truncated normal distributions. In the third column of Table 5, each line shows a pair $\mu_i, \sigma_i^2$,
standing for the mean and the variance of $T_i$, respectively. Each $T_i$ is restricted to lie in the range
$[\mu_i - 3\sigma_i, \mu_i + 3\sigma_i]$. Table 6 shows the probabilities associated with the decision node $d_1$, Table 7
the probabilities associated with the loop nodes $\ell_1$ and $\ell_2$.
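
One simple way to draw from such a truncated normal distribution is rejection sampling, sketched below. The function name `draw_truncated_normal` is hypothetical, and the sketch assumes that the listed mean and variance are those of the underlying (untruncated) normal distribution; the text does not say how the observations were actually generated.

```python
import math
import random


def draw_truncated_normal(rng, mean, variance):
    """Draw from a normal restricted to mean +/- 3 standard deviations."""
    sd = math.sqrt(variance)
    low, high = mean - 3.0 * sd, mean + 3.0 * sd
    while True:
        x = rng.gauss(mean, sd)
        if low <= x <= high:  # accept only observations inside the range
            return x


rng = random.Random(0)
# Parameters (90, 45) are those listed in Table 5 for a referee processing the paper.
draws = [draw_truncated_normal(rng, 90.0, 45.0) for _ in range(1000)]
```

Since about 99.7% of a normal distribution lies within three standard deviations of its mean, rejections are rare and the loop almost always ends on the first draw.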

For the same 2% error and 95% confidence as above, we give the results from N = 4624 repeated
executions of Sample(D) in Figures 11 through 13. These figures show, respectively, the fitting
curve $f_2(x) = 4624\,\mathrm{lognorm}(4.965323, 0.421285, x)$ drawn on the frequency histogram of T[D] for
1-wide bins, the approximate distribution of T[D], and the approximate density of T[D] (with
$\delta = 25$ in Equation (23)).

6 Ongoing work 

The introduction of the constraint that each activity node requires certain amounts of finitely available
resources to execute gives rise to the so-called activity networks with constrained resources.
The problem associated with such networks is known as RCPSP (Resource-Constrained Project
Scheduling Problem) [2]. The RCPSP has many variations, but even the deterministic RCPSP
with fixed activity durations is NP-hard [1].

Resource-Constrained NDSANs (RCNDSANs) combine stochastic activity durations, nondeterminism,
and constrained resources. We are currently working on a simulation algorithm for RCNDSANs,
based on iterating the combination of two phases as many times as necessary for accuracy.
The first phase is responsible for obtaining a non-stochastic, deterministic instance of the input
RCNDSAN, by selecting one of its possible execution paths. (Here, the term "path" stands for
a plausible non-stochastic, deterministic scenario: a network represented by a directed acyclic
graph with fixed topology and fixed activity durations.) The second phase consists of employing
a heuristic procedure for the solution of the deterministic RCPSP. The repeated execution of
"path selection" combined with "scheduling heuristics" will generate close approximations to the
probability distribution of the variables under analysis.

We remark that our simulation algorithms turn out to be low-cost tools for the identification of the 
factors that most strongly influence completion time. After a simulation round, if needed, changes 
in the structure of the NDSAN/RCNDSAN under analysis can be proposed in order to improve 
its performance. Several simulation rounds may be rapidly performed until the desired efficiency is 
actually achieved. 

Figure 10: An NDSAN representing a paper reviewing process.

Table 5: Activity nodes of the NDSAN in Figure 10.

Node       Description                                         Mean, variance
$a_1$      authors submit paper                                1, 0.1
$a_2$      editor sends paper to referees 1 and 2              1, 0.1
$a_3$      referee 1 processes the paper                       90, 45
$a_4$      referee 2 processes the paper                       90, 45
$a_5$      editor processes reports                            2, 0.2
$a_6$      editor sends reports to authors                     1, 0.1
$a_7$      authors perform modifications                       14, 7
$a_8$      editor sends revised version to referees 1 and 2    1, 0.1
$a_9$      referee 1 processes revised version                 14, 7
$a_{10}$   referee 2 processes revised version                 14, 7
$a_{11}$   editor processes new reports                        2, 0.2
$a_{12}$   editor checks agreement of reports                  1, 0.1
$a_{13}$   editor makes final decision based on two reports    2, 0.2
$a_{14}$   editor sends paper to referee 3                     1, 0.1
$a_{15}$   referee 3 processes the paper                       90, 45
$a_{16}$   editor processes report of referee 3                2, 0.2
$a_{17}$   editor sends report of referee 3 to authors         1, 0.1
$a_{18}$   authors perform modifications                       14, 7
$a_{19}$   editor sends revised version to referee 3           1, 0.1
$a_{20}$   referee 3 processes revised version                 14, 7
$a_{21}$   editor processes new report of referee 3            2, 0.2
$a_{22}$   editor makes final decision based on three reports  2, 0.2
$a_{23}$   editor sends final result to authors                1, 0.1


Table 6: Probabilities associated with the decision node $d_1$ in Figure 10.

Node     Description       Outcome   Next activity   Probability
$d_1$    referees agree?   yes       $a_{13}$        75%
                           no                        25%


Table 7: Probabilities associated with the loop nodes in Figure 10.

Node       Description                 Outcome   Next activity   1st iter.   2nd iter.   3rd iter.
$\ell_1$   no need of modifications?   yes       $a_{12}$        81%         98%         100%
                                       no        $a_6$           19%         2%          0%
$\ell_2$   no need of modifications?   yes       $a_{22}$        90%         99%         100%
                                       no        $a_{17}$        10%         1%          0%

Figure 11: Fitting curve drawn on the frequency histogram of T[D] for N = 4624 and 1-wide bins;
D is the NDSAN of Figure 10.

Figure 12: Approximate distribution $F^N_{T[D]}$ for N = 4624; D is the NDSAN of Figure 10.

Figure 13: Approximate density $f^N_{T[D]}$ for N = 4624 and $\delta = 25$; D is the NDSAN of Figure 10.